Abstract. If $f(x_1, \dots, x_n)$ is a polynomial dependent on a large number of
independent Bernoulli random variables, what can be said about the maximum
concentration of $f$ on any single value? For linear polynomials, this reduces
to one version of the classical Littlewood-Offord problem: Given nonzero
constants $a_1, \dots, a_n$, what is the maximum number of sums of the form
$\pm a_1 \pm a_2 \pm \cdots \pm a_n$ which take on any single value? Here we
consider the case where $f$ is either a bilinear form or a quadratic form. For
the bilinear case, we show that the only forms having concentration
significantly larger than $n^{-1}$ are those which are in a certain sense very
close to being degenerate. For the quadratic case, we show that no form having
many nonzero coefficients has concentration significantly larger than
$n^{-1/2}$. In both cases the results are nearly tight.



1. Introduction: The Linear Littlewood-Offord Problem 

In their study of the distribution of the number of real roots of random polynomials, 
Littlewood and Offord [10] encountered the following problem: 

Question 1. Let $a_1, \dots, a_n$ be real numbers such that $|a_i| > 1$ for every $i$. What
is the largest number of the $2^n$ sums of the form

$$\pm a_1 \pm a_2 \pm \cdots \pm a_n$$

that can lie in any interval of length 1?

Littlewood and Offord showed an upper bound of $O(2^n \log n / \sqrt{n})$ on the number of
such sums. Erdos [4] later removed the $\log n$ factor from this result, giving an
exact bound of $\binom{n}{\lfloor n/2 \rfloor}$ via Sperner's Lemma, which is tight in the case
where all of the $a_i$ are equal. The same bound was later shown by Kleitman [8] in
the case where the $a_i$ are complex numbers. Rescaling Kleitman's result and using
Stirling's approximation gives the following probabilistic variant of the lemma:

Theorem 1. Let $n > 0$, and let $a_1, \dots, a_n$ be arbitrary complex numbers, at least
$m \ge 1$ of which are nonzero. Let $x_1, \dots, x_n$ be independent random variables drawn
uniformly from $\{1, -1\}$. Then

$$\sup_{c \in \mathbb{C}} P\Big(\sum_{i=1}^{n} a_i x_i = c\Big) \le \min\Big\{\frac{1}{2}, \frac{1}{\sqrt{m}}\Big\}.$$



This research was supported by NSF Grants DMS-0635607 and DMS-0456611. 


KEVIN P. COSTELLO 



In a sense Theorem 1 can be thought of as a quantitative description of the
dispersion of a random walk: No matter what step sizes the walk takes, as the number of
steps increases the walk becomes less and less concentrated on any particular value.
In this interpretation the $\sqrt{n}$ in the bound is also unsurprising; if the step sizes are
small integers, we would expect the walk to typically be at an integer about
$O(\sqrt{n})$ distance from 0 at time $n$, so the concentration at individual points near 0
should be roughly $n^{-1/2}$.
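The random walk picture can be checked by brute force for small $n$. The following is a minimal sketch, not part of the original argument: the helper name `max_concentration` and the two coefficient choices are illustrative assumptions. It enumerates all $2^n$ sign patterns and reports the largest point mass of the sum.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb, sqrt


def max_concentration(coeffs):
    """Exact sup over c of P(sum a_i x_i = c) for uniform random signs x_i."""
    coeffs = list(coeffs)
    counts = Counter(
        sum(a * s for a, s in zip(coeffs, signs))
        for signs in product((1, -1), repeat=len(coeffs))
    )
    return Fraction(max(counts.values()), 2 ** len(coeffs))


n = 10
equal_steps = max_concentration([1] * n)             # all step sizes equal
distinct_steps = max_concentration(range(1, n + 1))  # step sizes 1, 2, ..., n

# Equal steps attain the central binomial coefficient exactly.
print(equal_steps == Fraction(comb(n, n // 2), 2 ** n))
# Compare both concentrations with the n^(-1/2) scale.
print(float(equal_steps), float(distinct_steps), 1 / sqrt(n))
```

Equal step sizes give the extremal case, while spreading the step sizes out lowers the concentration further.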

In 1977 Halasz [5] gave several far reaching generalizations of Theorem 1, both to 
higher dimensions and to more general classes of random variables. One (rescaled) 
result of his is 

Theorem 2. Let $a_1, \dots, a_n$ be vectors in $\mathbb{R}^d$ such that no proper subspace of $\mathbb{R}^d$
contains more than $n - m$ of the $a_i$. Let $x_1, \dots, x_n$ be independent complex-valued
random variables such that for some $p < 1$,

$$\sup_{i, c} P(x_i = c) \le p.$$

Then

$$\sup_{c \in \mathbb{R}^d} P\Big(\sum_{i=1}^{n} a_i x_i = c\Big) = O_{p,d}(m^{-d/2}).$$

The original Littlewood-Offord lemma corresponds to the special case where $d = 1$
and the $x_i$ are iid Bernoulli variables. Again this can be thought of as a
dispersion result: a linear polynomial which depends on a large number of independent,
moderately dispersed random variables will itself be very dispersed. Furthermore,
the dispersion will be greater if the coefficients of the polynomial are in some sense
truly $d$-dimensional.

One application of these results is in the study of random matrices, since several key
parameters of a matrix (e.g. the determinant, or the distance from one row to the
span of the remaining rows) are linear forms in the entries of a single row or column
of the matrix. Komlos [9] used Theorem 1 in 1967 to show that a random Bernoulli
matrix (one whose entries are independently either 1 or -1) is almost surely
non-singular. Later, Kahn, Komlos and Szemeredi [7] used the ideas of Halasz to show
that the singularity probability is exponentially small in the size of the matrix.
The current best bound for this probability, $(\frac{1}{\sqrt{2}} + o(1))^n$ for an $n \times n$ matrix [1],
comes from a detailed analysis of the inverse of the Littlewood-Offord problem,
which can be thought of as

Question 2. If $\sum a_i x_i$ is highly concentrated on one value, what can be said about
the $a_i$?

The intuition here is that if the sum takes on a single value with probability close to
$n^{-1/2}$, then the $a_i$ should be very highly structured. Tao and Vu [13] and Rudelson and
Vershynin [11] showed that this is in fact the case: If the sum takes on a single
value with probability at least $n^{-c}$ for some fixed $c$, then the coefficients must have
been drawn from a short generalized arithmetic progression. One special case of
this result can be expressed more quantitatively in the following theorem from [15]:

Theorem 3. Let $a_1, \dots, a_n$ be nonzero complex numbers, and let $\epsilon < \frac{1}{2}$ and $\alpha > 0$ be
fixed. Then there is an $N_0 = N_0(\epsilon, \alpha)$ such that if $n > N_0$ and, for $x_i$ independently
and uniformly chosen from $\{1, -1\}$,

$$P\Big(\sum_{i=1}^{n} a_i x_i = c\Big) \ge n^{-1/2 - \epsilon},$$

then there is a $d \in \mathbb{R}$ such that all but $n^{1-\alpha}$ of the $a_i$ have the form

$$a_i = d b_i,$$

where the $b_i$ are integers such that $|b_i| \le n^{\epsilon + \alpha}$.

The same holds true if the $x_i$ are independent and identically distributed "lazy
walker" variables satisfying $P(x_i = 0) = 2p$, $P(x_i = 1) = P(x_i = -1) = 1 - p$ for
some $0 < p < 1$ ($N_0$ is now also dependent on $p$).

2. Statement of Main Results 

Our goal here will be to develop and strengthen extensions of Theorem 1 and related 
results to polynomials of higher degree, in particular bilinear and quadratic forms. 
To begin, let us consider the following result (implicit in [2]), which we reprove here 
for convenience: 

Theorem 4. Let $A = (a_{ij})$, $1 \le i \le m$, $1 \le j \le n$, be an array of complex numbers,
and suppose that at least $r$ distinct rows of $A$ each contain at least $r$ nonzero entries.
Let $x = (x_1, \dots, x_m)$ and $y = (y_1, \dots, y_n)$ be two vectors whose $m + n$ entries are
random variables independently and uniformly chosen from $\{1, -1\}$. Then

$$\sup_c P\Big(x^T A y = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} x_i y_j = c\Big) = O(r^{-1/2}).$$

Proof: Without loss of generality we may assume that the rows in question
correspond to the variables $x_1$ through $x_r$.

Let $W_i = \sum_j a_{ij} y_j$, and let $W$ denote the number of $i$ between 1 and $r$ for which
$W_i$ is equal to 0. We have

$$P(x^T A y = c) \le P\big(W \ge \tfrac{r}{2}\big) + P\big(x^T A y = c \wedge W < \tfrac{r}{2}\big).$$

We bound each term separately. For the first term, we view $W$ as a sum of the
indicator functions of the events that each $W_i$ is equal to 0. Since by Theorem 1 each
$W_i$ is equal to 0 with probability $O(r^{-1/2})$, it follows from linearity of expectation
that $E(W) = O(r^{1/2})$, and therefore from Markov's inequality that

$$P\big(W \ge \tfrac{r}{2}\big) = O(r^{-1/2}).$$

For the second term, we treat $y$ as fixed and write

$$x^T A y = \sum_i W_i x_i.$$

If $W$ is at most $\frac{r}{2}$, then the right hand side is a linear form in the $x_i$ with at least
$\frac{r}{2}$ nonzero coefficients. It follows from Theorem 1 and taking expectations over $y$
that this term is $O(r^{-1/2})$. ■

In a certain sense this is a weaker result than we might expect. If $A$ is an $n \times n$
matrix of small nonzero integers, then the magnitude of $x^T A y$ will typically be
around $n$, so we might expect a concentration probability of $n^{-1}$ instead of $n^{-1/2}$.
However, Theorem 4 is tight, as the polynomial $(x_1 + \cdots + x_n)(y_1 + \cdots + y_n)$ shows.
What our first main result shows is that every bilinear form with sufficiently large
concentration probability is in some sense close to this degenerate example.
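The tightness of Theorem 4 for the product form can be made concrete. The sketch below is an illustration under our own naming (`prob_sum_zero` and `prob_product_zero` are hypothetical helpers). It computes the exact probability that the product of the two sums vanishes and rescales it by $\sqrt{n}$.

```python
from fractions import Fraction
from math import comb, sqrt


def prob_sum_zero(n):
    """P(x_1 + ... + x_n = 0) for independent uniform signs."""
    if n % 2:
        return Fraction(0)
    return Fraction(comb(n, n // 2), 2 ** n)


def prob_product_zero(n):
    """P((x_1+...+x_n)(y_1+...+y_n) = 0): at least one factor vanishes."""
    p = prob_sum_zero(n)
    return 1 - (1 - p) ** 2


for n in (4, 16, 64, 256):
    p = prob_product_zero(n)
    # The rescaled value stays bounded away from 0, so the decay is n^(-1/2).
    print(n, float(p), float(p) * sqrt(n))
```

The rescaled column stays of constant order, which matches the $r^{-1/2}$ rate rather than the $n^{-1}$ rate one might hope for.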

Theorem 5. Fix $\epsilon > 0$. Let $x = (x_1, \dots, x_m)$ and $y = (y_1, \dots, y_n)$ be independent
random vectors whose entries are uniformly chosen from $\{1, -1\}$, and suppose that
for some $r \le m$ every row of the coefficient matrix $A$ of the bilinear form $x^T A y$
contains at least $r$ nonzero entries. If $r$ and $m$ are sufficiently large and there is a
function $f$ such that

$$P(x^T A y = f(y)) \ge r^{-1+\epsilon} \qquad (1)$$

then A contains a rank one submatrix of size at least (m — Od ^^^a^ )) x (n — 
Oe( j^gii ^ )) (here the constant in the 0{) notation is as r tends to infinity and 
is allowed to depend on e). 

The same holds true if (1) holds when the entries of $y$ are independently set equal
to 0 (with probability 1/2) or $\pm 1$ (with probability 1/4 each).

In particular, this holds for the case where $f(y) = c$ is constant.

Remark 1. Note that we now require the stronger condition that every row have
many nonzero entries. If this does not hold, we can first expose the $x_i$ corresponding
to rows with few nonzero entries, then apply Theorem 5 to the bilinear form on
the remaining variables. It follows that the rows of $A$ having many nonzero entries
must correspond almost entirely to a rank one submatrix.

Remark 2. The $-1$ in the exponent is sharp. If $A$ is a small integer matrix, then
$x^T A y$ will typically be on the order of $n$ in absolute value, so by the pigeonhole
principle some value is taken on with probability $\Omega(n^{-1})$. However, a randomly
chosen such $A$ will with high probability not have rank one submatrices of size
larger than $O(\log n)$.

In terms of the original bilinear form, a rank one submatrix corresponds to a form
which factors completely as $x^T A y = g(x) h(y)$. Theorem 5 states that any bilinear
form with sufficiently large concentration probability is highly structured in the
sense that it can be made into one which factors by setting only a small portion of
the variables equal to 0.
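The correspondence between rank one matrices and factoring forms can be checked directly on a small case. In this sketch the vectors `u` and `v` are hypothetical coefficient vectors chosen for illustration; the matrix is their outer product, and the check runs over every sign pattern.

```python
from itertools import product


def bilinear(x, A, y):
    """Evaluate x^T A y for a matrix A given as a list of rows."""
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


u = [2, -1, 3]        # hypothetical coefficients of the linear form g
v = [1, 4, -2, 5]     # hypothetical coefficients of the linear form h
A = [[ui * vj for vj in v] for ui in u]   # rank one matrix u v^T


def g(x):
    return sum(ui * xi for ui, xi in zip(u, x))


def h(y):
    return sum(vj * yj for vj, yj in zip(v, y))


# The bilinear form of a rank one matrix factors for every sign pattern.
factors = all(
    bilinear(x, A, y) == g(x) * h(y)
    for x in product((1, -1), repeat=len(u))
    for y in product((1, -1), repeat=len(v))
)
print(factors)
```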

We next turn our attention to quadratic forms $x^T A x$, where $x$ is again random.
Here we first aim to show

Theorem 6. Let A be an n x n symmetric matrix of complex numbers such that 
every row of A has at least r nonzero entries, where r > ea;p((lnn)^/^), and let 
$L$ be an arbitrary linear form. Let $x$ be a vector of length $n$ with entries chosen
uniformly and independently from $\{1, -1\}$. Let $\delta > 0$ be fixed. Then

$$\sup_c P_c := P(x^T A x = L(x) + c) = O_\delta(r^{-1/2+\delta}).$$

In particular, the above bound holds for the case where $L(x)$ is identically 0.

We will then remove the assumption that every row of $A$ have many nonzero entries,
obtaining the following corollary, which may be easier to apply in practice:

Corollary 1. Let $A$ be an $n \times n$ symmetric matrix of complex numbers such that
at least ran of the entries of A are nonzero, where m > 3exp{{\nny^^). Let L and 
$x$ be as above. Then for any $\delta > 0$,

$$\sup_c P_c := P(x^T A x = L(x) + c) = O_\delta(m^{-1/2+\delta}).$$

Remark 3. Again the 1/2 is sharp, as can be seen from the form

$$(x_1 + \cdots + x_n)(x_1 + \cdots + x_m).$$

A weaker version of Theorem 6 (with $\frac{1}{2}$ replaced by $\frac{1}{8}$) was proved as a consequence
of Theorem 4 in [3]. The improvement in the bound will come from a combination of
Theorem 5 and the use of a probabilistic variant of the Szemeredi-Trotter theorem.

We will prove Theorem 5 in the next section, and the proof of Theorem 6 and 
Corollary 1 will come in the following section. The remainder of the paper will be 
devoted to conjectured extensions of both results. 

3. The Proof of Theorem 5 

As in the proof of Theorem 4, we begin by dividing the vectors y into two classes 
based on how many coordinates of Ay are equal to 0. 

Definition 1. A vector y is typical if at least r^~3 entries of Ay are nonzero. 
Otherwise it is atypical. 

Theorem 5 is an immediate consequence of the following two lemmas. 

Lemma 1. If $P(x^T A y = f(y) \wedge y \text{ is typical}) \ge \frac{1}{2} r^{-1+\epsilon}$, then the conclusions of
Theorem 5 hold.

Lemma 2. If $P(y \text{ is atypical}) \ge \frac{1}{2} r^{-1+\epsilon}$, then the conclusions of Theorem 5 hold.

Remark 4. If we consider a form which factors perfectly as $x^T A y = g(x) h(y)$, then
the hypothesis of Lemma 1 corresponds to the case where $g(x)$ is very structured
(concentrated on a single value with probability close to $r^{-1/2}$), while that of Lemma
2 corresponds with the same property holding for $h(y)$.



We will examine each lemma in turn. 

3.1. The proof of Lemma 1. We will assume throughout this section that $A$ is
a matrix such that

$$P(x^T A y = f(y) \wedge y \text{ is typical}) \ge \tfrac{1}{2} r^{-1+\epsilon}.$$

It follows from Lemma 1 that for any yo which is typical we have 

P,(ar^Ayo = /(yo)) <r-^+i. (2) 

Our argument will go roughly as follows: Under our assumptions, we know that
there must be many typical $y_0$ for which (2) is not too far from equality. By
Theorem 3, we know that for such $y_0$ the coordinates of $Ay_0$ must be very highly
structured, in the sense that all of them except for a small exceptional set must lie
in not too long an arithmetic progression.

The difficulty is that the exceptional sets in Theorem 3 may be different for different
$y_0$. However, there will still be many "small" (of size much smaller than $n$) sets of
coordinates which will lie entirely outside the exceptional set for most $y$. We will
show that such sets correspond to small collections of rows in $A$ which are very
close to being multiples of each other, and then aggregate those collections to find
our $A'$. We now turn to the details.

We will make use of the following (truncated) quantitative description of how
embeddable a small group of real numbers is in a short arithmetic progression, which
can be thought of as a variant of the essential LCD used in [11].

Definition 2. The commensurability of a $k$-tuple $(a_1, \dots, a_k)$ of real numbers
is defined by

Comm{ai, . . . ak) = max{r~^~^^ , -^}, 

where $R$ is the length of the shortest arithmetic progression containing 0 and every
$a_i$ simultaneously.

For example, if a < 6 are positive integers, then, up to the truncation at r~^~^i, 
Comm{a, h) = qcW^~S}' («!, ^2, • • • Ofe) are all drawn from an arithmetic 

progression of length $q$ containing 0, we are trivially guaranteed that $\mathrm{Comm}(a_1, \dots, a_k)$
is at least $\frac{1}{q}$. We next characterize the "small sets" of coordinates mentioned above
in terms of this commensurability.

Definition 3. A $k$-tuple $(v_1, v_2, \dots, v_k)$ of vectors is neighborly if
EyComm{vly, v^y, . . . v^y) > ^r~^+^ 

D 

Fix ko := log^r. Our next lemma states that the number of neighborly tuples is 
quite large: 

Lemma 3. For k < ko, there are at least m''{l — ^ ) neighborly k— tuples such 
that each vj is a row of A. 

The proof of this lemma will be deferred to a later section. Our next goal will
be to translate the neighborliness of a tuple into structural information about the
corresponding rows of $A$. One natural way in which a tuple can be neighborly is
if the rows in $A$ are themselves small multiples of each other, in which case the
corresponding coordinates of $Ay$ will always be small multiples of each other. Our
next lemma states that every neighborly tuple is in some sense close to this example.

Lemma 4. Let $k \le k_0$, and let $(v_1, v_2, \dots, v_k)$ be neighborly. Then there are unique
real numbers $d_2, \dots, d_k$ and sets $S_2, \dots, S_k$ of coordinates such that

• For each $j$, $v_1 = d_j v_j$ on all coordinates outside of $S_j$

• 11^=2 l'^i\Ui=2 "^ili = Oc{r^~^), where \S\i = 1 if S is empty and \S\i = 
min{|S'|,4} otherwise. 

What's important here is that not only does each row differ only in a few places 
from being a multiple of the first row in the tuple (the exceptional sets are of size 
o(r)), but also that the exceptions will tend to occur in the same columns. This 
latter fact will help keep the exceptional sets from growing too quickly when we 
attempt to examine many neighborly tuples at once. Again we will defer the proof 
of this lemma to a later section. 

Together, the above two lemmas state that the matrix A must have a great deal 
of local structure, in the sense that many not-too-large collections of rows are very 
close to being multiples of each other. Our goal will now be to combine these into 
a single global structure. Using Lemmas 3 and 4, we will be able to prove the 
following weakened version of Theorem 5, which allows the number of exceptional 
rows to be proportional to m instead of r. 

Lemma 5. If A satisfies the hypotheses of Theorem 5, then A contains a rank one 
submatrix of size (m — Oe{j^-p)) x (n — Oe{r^~'^)). 

In the following sections we will first prove Lemma 5 assuming the truth of Lemmas 
3 and 4, then leverage that result into the stronger bound required by Theorem 5. 
We will finish the proof of Lemma 1 by proving Lemmas 3 and 4. 

3.2. The proof of Lemma 5 assuming lemmas 3 and 4. Motivated by the 
conclusion of Lemma 4, we make the following definition: 

Definition 4. Let $V = (v_1, \dots, v_k)$ be an (ordered) neighborly $k$-tuple. The score
of $V$ is given by

$$\mathrm{Score}(V) = \sum_{j=2}^{k} \chi\Big(S_j \not\subseteq \bigcup_{i=2}^{j-1} S_i\Big),$$

where the $S_j$ are as in Lemma 4 and $\chi(E)$ is the indicator function of the event $E$.

The score is well defined, since the $d_j$ and $S_j$ are unique in that lemma. It also has
the following useful properties:

• $\mathrm{Score}(v_1, \dots, v_k) \le \mathrm{Score}(v_1, \dots, v_{k+1})$. Equality holds iff $S_{k+1} \subseteq \bigcup_{i=2}^{k} S_i$.

• If (wi, . . . Wfc) is neighborly, then there can be at most 
log4(Oe(r^^'r)) < logr — 1 different j for which the score increases from 

{Vl, ...Vj) to {Vl, . . -Vj + l). 

For a given (ordered) neighborly $k$-tuple $V = (v_1, \dots, v_k)$ of rows of $A$ with $k < k_0$,
let $S(V)$ be the collection of all rows $v$ of $A$ such that $(v_1, \dots, v_k, v)$ is a neighborly
tuple with the same score as $V$. Note that for any $V$, all of the rows in $S(V)$ are
multiples of $v_1$ (and thus of each other) except in the coordinates where a prior
$d_j v_j$ differed from $v_1$, and the number of such coordinates is at most
k k J— 1 

\[jS^\=J2\SA[jSi\=0,{r'-^) 

j=2 ]=2 i=2 

by Lemma 4. It follows that we have a rank one submatrix of dimensions 
1 5(F) I X n — Oe(r^~"r). It therefore suffices to show some S{V) is large. Let b be 
the maximal value of |5(F)| over all neighborly tuples of size at most ko — 1. We 
count the number of neighborly A;o— tuples in two ways. 

Method 1: By Lemma 3, there are at least m^°{l — ) such tuples. 

Method 2: We can bound the number of such tuples by first choosing a set $J$ of
size $\log r - 1$ of places in which the score is allowed to increase, then restricting our
attention only to those tuples whose scores increase only on $J$. For each $j$ where the
score fails to increase from $(v_1, \dots, v_j)$ to $(v_1, \dots, v_{j+1})$, there are at most $b$ choices
for $v_{j+1}$. For each other $j$, there are at most $m$ choices. It follows that the number
of tuples is at most

$$\binom{k_0 - 1}{\log r - 1} m^{\log r} b^{k_0 - \log r}.$$

Comparing our methods, we have 

i_!_^<fl] fc^^ 

m \m J 

Using the relationship e(-'^+°':W)^ < I — x < e^^, we have 

Taking logs and using the definition of ko gives 

m-b ^ logTOlogfco + (1 + 0(1))^ ^ 1 
m ~ ko- logr log'^r 

It follows that b>m — 0{j^^), so we are done. 

3.3. The proof of Lemma 1, from Lemma 5. We construct our rank one 
submatrix using the following procedure. Let Aq be a rank one submatrix of A 
of size (m — 0{j^^)) x (n — 0{r^~'^)) (such a matrix is guaranteed to exist by 
Lemma 5). We initialize Xi C {a;i, . . . a;„} to be the variables corresponding to the 



rows of $A_0$, and $X_2$ to be the remaining variables, and $X_3$ to initially be empty.
We also initially set $Y_1$ to be the variables corresponding to the columns of $A_0$. We
now repeatedly follow the following procedure:

If the matrix corresponding to $(X_1 \cup X_2) \times Y_1$ has rank one, stop. If this is not the
case, choose $x_i \in X_1$, $x_j \in X_2$, and $y_k, y_l \in Y_1$ such that $a_{ik} a_{jl} \ne a_{il} a_{jk}$. Move $x_j$
from $X_2$ to $X_3$, and remove $y_k$ and $y_l$ from $Y_1$.
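The pruning step can be sketched in code. This is a simplified illustration, not the exact procedure above: it does not track the sets $X_1$, $X_2$, $X_3$ separately, and the function names and the example matrix are our own assumptions. It uses the fact that a matrix has rank at most one exactly when every 2 x 2 minor vanishes.

```python
from itertools import combinations


def violating_minor(A, rows, cols):
    """Return (i, j, k, l) with a_ik * a_jl != a_il * a_jk, or None.

    A submatrix with no such 2x2 minor has rank at most one.
    """
    for i, j in combinations(rows, 2):
        for k, l in combinations(cols, 2):
            if A[i][k] * A[j][l] != A[i][l] * A[j][k]:
                return i, j, k, l
    return None


def prune_to_rank_one(A):
    """Greedy sketch: drop one row and two columns per violating minor."""
    rows = list(range(len(A)))
    cols = list(range(len(A[0])))
    while (hit := violating_minor(A, rows, cols)) is not None:
        _, j, k, l = hit
        rows.remove(j)
        cols.remove(k)
        cols.remove(l)
    return rows, cols


A = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [3, 6, 9, 12],
     [1, 0, 3, 5]]   # hypothetical matrix: the last row breaks rank one
print(prune_to_rank_one(A))
```

Each round removes one row and two columns, so the number of rounds controls how much of the matrix is lost, which is the quantity the argument needs to bound.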

We can always find the necessary Xi and Xj since the matrix on Xi x Yi will always 
be a rank one matrix due to our choice of Aq. It remains to check that this procedure 
in fact terminates after at most Q( j^^5^ ) steps, so that the final rank one matrix 
is sufficiently large. Let us assume to the contrary that this does not occur. 

Let S he a set of size r formed by taking j^p^ variables from X3 and r — j^^b^ 

variables from Xi, and let T be the remaining variables in X. Let A be the 
submatrix of A consisting of the rows corresponding to S. We can write 

x'^^y - f{y) ^ x^Ay - g{y, xt), 

where xg (resp. xt) is the vector of variables in S (resp. T) By assumption we 
have 

r-i-^ < V{x'^Ay = f{y)) 

= F,T{'Ps{xsAy = g{y,XT)) 
< supP{xgAy = g{y,XT))- 

XT 

It follows from Lemma 5 that A must contain a rank one submatrix of size 
(r — Q( iogtir )) X (n — 0(r^~^)). Since the number of excluded variables is much 
smaller than , \ , there must be a variable x,; G X-t such that both x-j and the 
corresponding yk and yi are contained in this submatrix, as well as some variable 
Xi' e Xi. However, this is a contradiction, as ai'kaji ^ ai'iUkj- 

3.4. The proof of Lemma 3. We define gy and Dy as follows: 

• If $y$ is atypical, then $g_y = 0$ and $D_y = \{1, \dots, m\}$

• If 2/ is typical and no arithmetic progression of length at most r 2~4 contains 
at least m — r^~i of the elements of Ay, then gy = r~2 + l and Dy = 
{l,...m} 

• Otherwise, let R be an arithmetic progression of minimal length containing 
and at least m — r^~i elements of Ay. We define gy = and Dy to 
be those i such that the i*^ coordinate of Ay is in R. 

Note that in this definition the $D_y$ are not uniquely determined. We choose one
arbitrarily for each $y$. Furthermore, by construction, for any $k$-tuple contained in
$D_y$, we have $\mathrm{Comm}(a_1, \dots, a_k) \ge g_y$.

By viewing the Inverse Littlewood-Offord Theorem 3 in the "forward" direction we 
can now obtain the following: 
Lemma 6. For every fixed ^ < \ there is an ro > Q such that for all matrices A 
with r > ro and all typical y* we have 

^{x'^Ay = f{y)\y = y*) < r-^+^Qy,. 



Proof (of Lemma 6): Since by construction gy, > r l + l ^ there is nothing to prove 
unless the probability in question is at least r^^+'s", which we will assume to be 
the case. Let ri be the number of nonzero coefficients of x'^ Ay* , viewed as a linear 

form in x, and let 'P{x'^Ay* = f{y*)) = r^^ Since y* is typical, ri > r^~i. In 
particular, this implies that eo < |. 

Applying Theorem 3 to this form with a = |, we see there is an arithmetic pro- 

1— - 

gression containing all but * coefficients and of length 



2^4, 



1 Fix^Ay* = /(r)) 

r(-i+f)(i-f) 

< 



Fix^Ay* = f{y*)) 

r 2+8 _e£ 

I>{xTAy*=f{y*)f 
If follows that Qy* > ri~^P{x'^Ay* = f{y*)) as desired. 



Taking expectations over all y, we see that 

P{x'^Ay ^ f{y) A y is typical ) < r-^^^'^'Eyigy), 
which combined with the hypothesis of Lemma 1 in turn implies that 

^y{9y) > r-^+* (3) 

Let Z be the collection of fc— tuples satisfying 

Ej/(<7(y)x({ai, ■ • ■ flfc} ^ Dy)) > ^Eyigy) 
By (3), every tuple in Z is neighborly. It remains to check that |.^| is large. 

Since by construction \Dy\ > m — i for every y, we have 

Ea^,...a„'E,y{gyXi{ai,...ak} C Dy)) = Ey{gyP{{ai, . . .ak} e Dy)) 

> ( )'E,(5,). 

Combining this with the definition of Z, we have 

\Z\Eyi9y) + 5^(m'= - \Z\) > (m - ri-t)'=E,(5,) > m\l - :^)E,(5,) 
Solving the above inequality, we obtain 

3fc 



z\ > m'^n - — ) > m^n ) 

' - ^ 2m'- ^ m ' 



and we are done. 



3.5. The proof of Lemma 4 for k = 2. Let $(a, b)$ be a pair of neighborly
vectors. Our goal will be to show that they are very close to being multiples of
each other.

We make use of the general fact that for any random variable $X$ taking values
between 0 and 1

$$E(X) = \int_{u=0}^{1} P(X > u)\,du = \int_{t=1}^{\infty} \frac{P(X > 1/t)}{t^2}\,dt \qquad (4)$$
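Identity (4) is the usual tail formula followed by the substitution $u = 1/t$. As a sanity check, the sketch below evaluates all three quantities exactly for a small discrete distribution; the distribution `dist` and the helper names are hypothetical choices for illustration only.

```python
from fractions import Fraction as F

# Hypothetical distribution of X on [0, 1]: value -> probability.
dist = {F(0): F(1, 4), F(1, 3): F(1, 4), F(1, 2): F(1, 3), F(1): F(1, 6)}


def tail(u):
    """P(X > u)."""
    return sum(p for v, p in dist.items() if v > u)


expectation = sum(p * v for v, p in dist.items())

# P(X > u) is a step function with jumps at the values of X.
cuts = sorted(set(dist) | {F(0), F(1)})
pieces = list(zip(cuts, cuts[1:]))

# Left integral in (4): sum of (step height) * (step width).
integral_u = sum(tail(lo) * (hi - lo) for lo, hi in pieces)


def inv_sq_integral(t0, t1):
    """Integral of t^-2 from t0 to t1, where t1 = None means infinity."""
    return 1 / t0 - (0 if t1 is None else 1 / t1)


# Right integral in (4): u in (lo, hi] corresponds to t in [1/hi, 1/lo).
integral_t = sum(
    tail(lo) * inv_sq_integral(1 / hi, None if lo == 0 else 1 / lo)
    for lo, hi in pieces
)
print(expectation, integral_u, integral_t)
```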

In our case $X$ will be $\mathrm{Comm}(a^T y, b^T y)$, so bounding the right hand side becomes
a question of how likely it is for $a^T y$ and $b^T y$ to be embeddable in a progression of
a given length. We make the following further definitions:

Definition 5. A pair (Zi, of integers is degenerate for the vector pair (a, b) if 
ha and hb agree in at least n — ^ positions and at least one of h and I2 is nonzero. 

Note that there is (up to multiples) at most one degenerate pair for (a, b) . 
We further define for an integer q 

Pab{q) ■■= P{3{hj2) (0,0)|(Zi,Z2) is non-dcgcncrate A Zia^y = hb^y A \h\, {hi < q) 
Using these definitions and the definition of Comm{a,b), we have 
< 'Ejy{Comm{a^y,b^y)) 

— W ~^^2 — + ''"^^^ + P(^oa^y = Zo^^y for a degenerate (fco, Zo)), 

Jt=\ q 

The middle term on the right hand side is negligible, and we will next show that 
the first term is also small by showing 

Lemma 7. For any positive a > 0, any q < -y/r and any a and b, there is a constant 
Ca dependent only on a such that pab{q) < 1/2 -a ■ 



We may without loss of generality assume Cq, > 1. It follows that for any < a < ^, 
assuming Lemma 7, we have 

rpabiq), , r'"'"" . , r ^q 

= 0„(r"^/^+"lnr) 

By taking a sufficiently close to 0, we see that for large r the contribution from the 
first term is also o(r~2+"^). 

It follows that the dominant contribution to the expectation must come from the
third term. This implies that a degenerate pair $(k_0, l_0)$ exists, and that we
furthermore must have

P{koa^y = lob^y) > 

It follows by the linear Littlcwood-Offord lemma that tlw^ linear form {koo^ — 
lob^)y must have 0{r^~'^) nonzero coefficients. But this is exactly what the lemma 
requires. 

3.6. The proof of Lemma 7. It suffices to prove the following: 

Lemma 8. Let $a_1, \dots, a_n$ and $b_1, \dots, b_n$ be fixed (real or complex) constants such
that for each $i$ at least one of $a_i$ and $b_i$ is non-zero. Let $x_1, \dots, x_n$ be iid
variables uniformly chosen from $\{-1, 1\}$. Let $E_q$ be the event that there exist $u$ and $v$
satisfying

• $|u|, |v| \le q$

• There are at least different i for which vai ^ uui . 

• $v \sum a_i x_i = u \sum b_i x_i$

Then for any a > and any 1 < q < \fn, 

V n 

where the constant implicit in the O notation is as n tends to infinity and may 
depend on a. 

We will throughout assume that both q and n are tending to infinity. By utilizing 
a Preiman isomorphism of order 2n^ (see for example [14], Lemma 5.25), we may 
assume that the $a_i$ and the $b_i$ are all real integers. We may furthermore without
loss of generality assume for every $i$ either $b_i$ is positive or $b_i = 0$ and $a_i$ is positive.

Let fc be a positive integer satisfying that k > ^. We define Lq = 1 and for 

^ < j < k, wc define 

Lj = sup |{(«i, . . . ij) : Oil H \- a^. = c A bi^ -\ h6j^. =c/}| 

(c,d)eC^ 

Clearly 1 < Lj < , and by treating ij as fixed we furthermore see that 

Lj-i < Lj < nLj-i. This implies that one of the following two cases must hold 

• There is a j between 1 and k for which Lj > " Lj-i 

q2W+T 

• Lk < \k 

We handle each case separately. 

, _ 2fc 

Case 1: Lk < n q ^fc+i . Here we will make use of the following result of Halasz 
(implicit in [5], see also [14]): 

Theorem 7. Let $k > 0$ be fixed, and let $a_1, \dots, a_n$ be nonzero (real or complex)
coefficients. Let $R_k$ be the number of $2k$-tuples $(i_1, \dots, i_k, j_1, \dots, j_k)$ for which
$a_{i_1} + \cdots + a_{i_k} = a_{j_1} + \cdots + a_{j_k}$. Then for $x_i$ uniformly chosen from $\{-1, 1\}$,

$$P\Big(\sum_{i=1}^{n} a_i x_i = c\Big) = O(n^{-2k-1/2} R_k).$$
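The quantity $R_k$ is easy to compute for small examples, which helps in seeing how additive structure in the coefficients drives the bound. The sketch below uses helper names of our own choosing (`halasz_count`, `max_concentration`). It counts $R_k$ by grouping $k$-fold sums and compares the result with the exact concentration found by enumeration.

```python
from collections import Counter
from itertools import product


def halasz_count(coeffs, k):
    """R_k: number of 2k-tuples of indices whose two k-fold sums agree."""
    sums = Counter(sum(t) for t in product(coeffs, repeat=k))
    return sum(c * c for c in sums.values())


def max_concentration(coeffs):
    """Exact sup_c P(sum a_i x_i = c) over uniform signs, by enumeration."""
    counts = Counter(
        sum(a * s for a, s in zip(coeffs, signs))
        for signs in product((1, -1), repeat=len(coeffs))
    )
    return max(counts.values()) / 2 ** len(coeffs)


n, k = 12, 2
for coeffs in ([1] * n, list(range(1, n + 1))):
    r_k = halasz_count(coeffs, k)
    # Columns: R_k, exact concentration, and the scale n^(-2k-1/2) * R_k.
    print(r_k, max_concentration(coeffs), n ** (-2 * k - 0.5) * r_k)
```

Equal coefficients give the largest possible $R_k = n^{2k}$, for which the bound reduces to the $n^{-1/2}$ rate of Theorem 1; coefficients with fewer additive coincidences give a smaller $R_k$ and a smaller concentration.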

Combining the above result and the union bound, we can write 

n 

P{Eg) < - biu)xi = 0) 

iu,v) «=1 

= 0(n-2'=-i/2) ^ ^ ^^y(^a,, + ■ ■ ■ + a,, - a,, a, J = -u(6,, + • • • 

= 0(n-2'=-V2) J2 ^ xKai, +--- + ai, -a,, aj,) = -u{bi, + ■ ■ ■ 

(>l.---«fc) {u,v) 

where the sum is taken over all pairs {u,v) such that Q < u < q, \v\ < q, 
GCD(u. v) = 1, and at least different i satisfy vhi ^ vcij. This last assump- 
tion guarantees that the linear form in the first inequality has at least O.ln nonzero 
coefficients for every {u, v) we are summing over, so that the Halasz bound above 
will be sufficiently strong. 

In the final term in the above bound, the inner summand is at most 1 unless 

(a, + (a, b)i^ H h (a, b)i^ = {a, b)j, + (a, b)j^ H h (a, b)j^, 

an equation which has at most Lku'^ solutions. 

It follows that 

PiE,) = Oiq^n-''-'/^Lk + n-'/^) 
which by our assumptions on Lfe is 0{^—^ — ) = 0(^"~^) 

Case 2: i,- > " -^-i-i We know that each variable can be involved in at 

q2fc+T 

most Oj{Lj-i) different j— tuples which sum to the same value. It follows that 
in this case for some absolute constant C, we can find a collection S of Cj " 

disjoint j— tuples, each of which has coefficients summing to the same (fixed and 
non-random) pair (c, d). By our assumption on the bi, and a,, we know that either 
d is positive or d = and c is positive. In particular, we know that at least one of 
c and d is nonzero. 

Define a $j$-tuple $(i_1, \dots, i_j)$ to be agreeable if $x_{i_1} = x_{i_2} = \cdots = x_{i_j}$. Note that each
tuple has a constant probability $2^{1-j}$ of being agreeable. Let $S'$ be the collection
of tuples in $S$ which are agreeable, and let $B$ be the event that $|S'| \ge 2^{-j} |S|$. We
have

Note that the agreeability of each tuple in $S$ is an independent event due to
our assumption that the tuples are disjoint. It follows by Chernoff's bound that
$P(\neg B) = o(n^{-1/2})$. We therefore focus on the second term.

To bound $P(E_q \wedge B)$, we will expose the variables by first exposing $S'$, then exposing
the value of all the variables not involved in a tuple in $S'$. We will then finally expose
the values of the variables in $S'$.
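Two elementary facts about agreeable tuples are used here: a $j$-tuple of independent signs is agreeable with probability $2^{1-j}$, and once a tuple is known to be agreeable its contribution takes only two opposite values. The following sketch checks both by enumeration; the function names and sample coefficients are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def agreeable_probability(j):
    """P(x_1 = ... = x_j) for independent uniform signs, by enumeration."""
    outcomes = list(product((1, -1), repeat=j))
    agreeable = [s for s in outcomes if len(set(s)) == 1]
    return Fraction(len(agreeable), len(outcomes))


def conditional_sums(coeffs):
    """Values of sum(c_i * x_i) over the agreeable sign patterns only."""
    return sorted(
        sum(c * s for c, s in zip(coeffs, signs))
        for signs in product((1, -1), repeat=len(coeffs))
        if len(set(signs)) == 1
    )


for j in range(1, 7):
    print(j, agreeable_probability(j))
print(conditional_sums([3, 1, 4]))
```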



We have for any tuple that 

3 

^QZi^h'^h) = ic,d,)\{ii,...ij) agreeable ) = 1/2 

i=l 

and the same for (— c, —d). It follows that, treating the set S' and the value of Xj 
for variables not in S' as fixed, 

ELl O-i^i ^Ljll !Jj + -1 



where z\ and Z2 are fixed constants and the j/j are independent ±1 variables. By 
paying at most a constant multiplicative factor and an exponentially small additive 
factor in the probability, we may replace the sum of the yj by a uniform distribution 
on [— 2-\/|S'|, 2-^|5'|]. We are thus essentially reduced to bounding the probability 
that ^^T^ can be written as a fraction with low numerator and denominator. We 
will soon show: 

Lemma 9. Let $n > 1$ be an integer, and let $a, b, c, d$ be real numbers (which may
depend on $n$) such that $ad \ne bc$. Let $\alpha > 0$ be any fixed parameter. Then for any
$1 < q < n$, there are at most $q n^{\alpha}$ integers $z \in \{1, \dots, n\}$ such that

$$h(z) := \frac{az + b}{cz + d}$$

has height at most $q$ (has numerator and denominator at most $q$ in absolute value
when written in lowest terms).
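For integer coefficients the count in Lemma 9 can be explored by brute force. This is only a sketch: the names `height` and `low_height_count` are ours, the coefficients are assumed to be integers, and values of $z$ with $cz + d = 0$ are simply skipped, which is a convention of this example rather than of the lemma.

```python
from fractions import Fraction


def height(frac):
    """Larger of |numerator| and |denominator| in lowest terms."""
    return max(abs(frac.numerator), abs(frac.denominator))


def low_height_count(a, b, c, d, n, q):
    """Number of z in {1, ..., n} with (az + b)/(cz + d) of height <= q."""
    count = 0
    for z in range(1, n + 1):
        den = c * z + d
        if den != 0 and height(Fraction(a * z + b, den)) <= q:
            count += 1
    return count


# h(z) = z has height z, so exactly q values qualify.
print(low_height_count(1, 0, 0, 1, 100, 10))
# h(z) = (z + 1)/(z + 2) is always in lowest terms, with height z + 2.
print(low_height_count(1, 1, 1, 2, 100, 10))
```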



Assuming Lemma 9 to be true, we know that for fixed zi, Z2 the probability that 

1 

this fraction can be written as ^ 7^ ^ is at most ■^j=^- Taking expectations over 

all zi,Z2,S' and using our bounds on S' under the assumption that B holds gives 
that 



n 



P{Eq AB)< -r= + P(^-r^ = -Adtti-cai^O for— different i) 

The second term on the right side corresponds to a linear form with ^ nonzero 
coefficients, so is 0(n~^/^). Again the result follows. 

It remains to prove Lemma 9. 

3.7. The proof of Lemma 9.

We may without loss of generality assume that $|a| \ge |c|$. We will further assume
without loss of generality that no prime divides all of $a, b, c, d$.

Let $\Delta = |ad - bc| > 0$. Note that any common divisor of $az + b$ and $cz + d$ is also a
divisor of $|a(cz + d) - c(az + b)| = \Delta$. Let $\tau(\Delta)$ be the number of divisors
of $\Delta$. We will split into two cases.

Case 1: t(A) < n"/^. For < i < (a + l)log2n, let Si denote the set of 
$z \in \{1, \dots, n\}$ such that $|az + b| \in [2^i, 2^{i+1}]$. It is clear that each $S_i$ lies in the union
of two intervals, each of which has size at most $2^i$. For any $z \in S_i$ such that $h(z)$
has height at most $q$, it must be the case that $az + b$ shares a divisor $v$ with $cz + d$
and $\Delta$ such that $v \ge 2^i / q$. We next claim that for any given $v$, there are not many
$z$ for which this can occur, as:

Claim 1. If $v \mid \gcd(az_1 + b, cz_1 + d)$ and $v \mid \gcd(az_2 + b, cz_2 + d)$, then $v \mid z_1 - z_2$.

Proof. Let $p$ be a (fixed) prime dividing $v$, and let $p^m$ be the largest power of $p$ dividing $v$. If $p$ does not divide $a$, then $p^m$ must divide $z_1 - z_2$, since $v \mid (az_1 + b) - (az_2 + b) = a(z_1 - z_2)$. Similarly, either $p^m$ divides $z_1 - z_2$ or $p$ also divides $c$. However, $p$ cannot divide both $a$ and $c$, for it would then follow that $p$ also divided $(az_1 + b) - az_1 = b$ and $d$, violating our assumption that $a, b, c, d$ shared no common factor. Therefore it must be the case that $p^m \mid z_1 - z_2$. But this is true for any prime, so we are done. ■
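Claim 1 can be checked numerically for small cases. The sketch below is only an illustration: the function name `claim_holds` is ours, integer coefficients are assumed, and it is enough to test the largest common divisor $v$, since every other common divisor divides it.

```python
from math import gcd


def claim_holds(a: int, b: int, c: int, d: int, n: int) -> bool:
    """Check Claim 1 for all pairs z1 < z2 in {1..n}.

    For each pair, v is the greatest common divisor of
    gcd(a z1 + b, c z1 + d) and gcd(a z2 + b, c z2 + d);
    the claim asserts that v divides z1 - z2.
    """
    g = [gcd(a * z + b, c * z + d) for z in range(1, n + 1)]
    for i in range(n):
        for j in range(i + 1, n):
            v = gcd(g[i], g[j])
            # z1 - z2 equals i - j because both indices are shifted by one
            if v != 0 and (i - j) % v != 0:
                return False
    return True
```

With $(a, b, c, d) = (6, 1, 4, 3)$, which share no common prime factor, the check passes. With $(2, 2, 2, 4)$ it fails, which shows that the coprimality assumption is needed.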

It follows that for a given $v$, there are at most $2^{i+1}/v$ choices of $z$ for which $v$ provides the required cancellation. Adding up over all $v$, we see that the number of $z \in S_i$ which lead to a height of at most $q$ is at most

$$\sum_{v \mid \Delta,\ v \ge 2^i/q} \frac{2^{i+1}}{v} \le 2q\,\tau(\Delta) \le 2q\,n^{\alpha/2}.$$

Adding up over all $S_i$, we see that the lemma holds in this case.

Case 2: $\tau(\Delta) > n^{\alpha/2}$. In this case it follows from classical number theoretic bounds on the number of divisors of an integer that $\Delta > n^{\omega(n)}$ for some $\omega(n)$ tending to infinity with $n$.

Recall that we are assuming that $|a| \ge |c|$, so in particular $a$ is non-zero. By paying an (additive) factor of at most $2n^{\alpha/2}$, we may therefore only consider values where $|az + b| > n^{\alpha/2}$.

The result will follow immediately if we can show that in any interval of length at most $n^{1-\alpha/2}/q$, there can be at most three such values of $z$ for which $h(z)$ has height at most $q$. Let us then assume to the contrary that there are four values $z_1, z_2, z_3, z_4$ in such an interval for which $h(z)$ has height at most $q$.

(Many of the key ideas in the proof of Lemma 9 are due to Ernie Croot.)

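Let $u_i = az_i + b$, let $v_i = cz_i + d$, and let $h(z_i) = u'_i/v'_i$ be written in lowest terms. We next make the following claim:

Claim 2. Let $u_i, u'_i, v_i, v'_i, z_i$ be as above. Then

$$\frac{|u_1u_2u_3u_4|}{\prod_{1 \le i < j \le 4} |z_i - z_j|} \le \operatorname{lcm}(u_1, u_2, u_3, u_4).$$

Proof. Since $\gcd(u_i, u_j) \mid [(az_i + b) - (az_j + b)] = a(z_i - z_j)$, and $\gcd(a, b) = 1$ by assumption, it follows that $\gcd(u_i, u_j) \mid z_i - z_j$. We therefore have

$$\frac{|u_1u_2u_3u_4|}{\prod_{1 \le i < j \le 4} |z_i - z_j|} \le \frac{|u_1u_2u_3u_4|}{\prod_{1 \le i < j \le 4} \gcd(u_i, u_j)} \le \operatorname{lcm}(u_1, u_2, u_3, u_4).$$

Combining this with the observation that $\operatorname{lcm}(u_1, u_2, u_3, u_4) \mid \Delta \operatorname{lcm}(u'_1, u'_2, u'_3, u'_4)$, we see that

$$q^4 \ge \operatorname{lcm}(u'_1, u'_2, u'_3, u'_4) \ge \frac{|u_1u_2u_3u_4|}{\Delta \prod_{1 \le i < j \le 4} |z_i - z_j|}. \qquad (5)$$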

We now divide into two further cases depending on the size of $a$ relative to $\Delta$.

Case 2a: \a\> A^/^. Then 

\uiU2U3U4\ > > q'^n'^A > g^A JJ^ \zi — Zj\, 

l<i<j<4 

which is a contradiction to (5). 



Case 2b: |a|, |c| < A^/^. Let M be the larger of |6| and \d\. It follows from our 
bounds on a and c that M > ^A^/^. 

Note that for any 2 in our range we have 

max{|a2 + b\, \cz + d\} > M - ^Jn^l^ > 

where we are here using our lower bounds on both M and A. It follows that 
GCD{azi + b, czi + d) > ^Mn~^, from which we know that 

\b\ > \Mn-^ - \azi\ > ^Mn'^ 

and a similar statement for $|d|$. In other words, both $b$ and $d$ would have to be much larger than both $|a|$ and $|c|$. This in turn would imply

$$\left| \frac{az + b}{cz + d} - \frac{b}{d} \right| < \frac{1}{2n^2}.$$

But an interval of width less than $\frac{1}{n^2}$ can contain at most one fraction of height less than $n$, since any nonzero difference between two such fractions is at least that large. We again reach a contradiction.



BILINEAR AND QUADRATIC VARIANTS ON THE LITTLEWOOD-OFFORD PROBLEM



3.8. The proof of Lemma 4 for k > 2. Let $(v_1, \dots, v_k)$ be a neighborly tuple. We first modify the definition of commensurability slightly, writing

$$\mathrm{Comm}^*(a_1, \dots, a_k) = \mathrm{Comm}(a_1, \dots, a_k)\,\chi\Big(\prod_{i=1}^{k} a_i \neq 0\Big).$$

We have by Theorem 1 and the fact that the commensurability is always at most 1 that

$$E_y(\mathrm{Comm}^*(v_1^Ty, \dots, v_k^Ty)) \ge E_y(\mathrm{Comm}(v_1^Ty, \dots, v_k^Ty)) - P(\text{some } v_i^Ty = 0)$$

$$\ge E_y(\mathrm{Comm}(v_1^Ty, \dots, v_k^Ty)) - kr^{-1/2}$$

1 _l^ic 

> r 2 T" 8 

- 12 

The advantage to this truncated commensurability is that we have the relationship

$$\mathrm{Comm}^*(a_1, \dots, a_k) \ge \frac{1}{R} \implies \frac{a_1}{z_1} = \frac{a_2}{z_2} = \cdots = \frac{a_k}{z_k}$$

for some integers $z_1, \dots, z_k$ which are at most $R$ in absolute value.
As in the k = 2 case, we have 

Ey{Comm*{vfy,...,vly)) < (T PMdq)+r-i+i (6) 

Jt=i Q 

^p(^ = = . . . = !|l for a degenerate I), 

where 

T 

d- y 

Pv{q) ■= P{31 = {h, ■ ■ - Ik) : 1 is non-degenerate A — — all equal A < q), 

and a $k$-tuple $(l_1, \dots, l_k)$ is degenerate if $(l_i, l_j)$ is degenerate for $(v_i, v_j)$ for every $i$ and $j$. Note that a given $(v_1, \dots, v_k)$ again has (up to multiples) only one degenerate $l$.

It follows from the proof of the k = 2 case that for any particular the con- 

tribution to Pv{q) from those tuples where {k, Ij) is nondegenerate is 0{ i/l-c ) for 

any a. Adding up over all pairs, it follows that Pv{q) = Q( ^i/2ia )■ A-S in the k = 2 
case, we now have 



nr 2 " 

Jt=l 



^dq = 0(A;V-i/2+« Inr) = o(r-H¥). 



by taking a to be sufficiently small. Again the contributions from the first two 
terms on the right hand side of (6) are small, so the last term must be large, that 
is to say 



= ^ = . . . = ^^^y ) > (7) 
Ml /2 /fc - 14 ^ ' 
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Let dj = and Sj to be the places where vi differs from djVj. We can rewrite (7) 
as the system 

$$\sum_{i \in S_2} (d_2v_2(i) - v_1(i))x_i = 0$$

$$\sum_{i \in S_3 \setminus S_2} (d_3v_3(i) - v_1(i))x_i = -\sum_{i \in S_2} (d_3v_3(i) - v_1(i))x_i$$

$$\vdots$$

$$\sum_{i \in S_k \setminus (S_2 \cup \cdots \cup S_{k-1})} (d_kv_k(i) - v_1(i))x_i = -\sum_{i \in S_2 \cup \cdots \cup S_{k-1}} (d_kv_k(i) - v_1(i))x_i.$$

We now successively expose the variables in $S_j \setminus (S_2 \cup \cdots \cup S_{j-1})$ for each $j$ and examine each equation in turn.

After we expose the variables in $S_2$, the probability that the first equation above holds is at most $|S_2|^{-1/2}$ by Theorem 1. We now treat the variables in $S_2$ as fixed, meaning that the right hand side of the second equation above is constant, and expose those in $S_3 \setminus S_2$. For any particular value of the variables in $S_2$, it again follows from Theorem 1 that the probability that the second equation holds is at most $|S_3 \setminus S_2|^{-1/2}$. Continuing onwards through the entire system, we have that the probability that the above system holds is at most

$$\prod_{j=2}^{k} \Big| S_j \setminus \bigcup_{i=2}^{j-1} S_i \Big|^{-1/2}.$$

The lemma follows by combining this with (7).
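Each step of the exposure argument applies the linear bound of Theorem 1 to the variables that are still unexposed. As a small sanity check of that linear bound, the following sketch enumerates all sign patterns of a short linear form and compares its maximum concentration with the central binomial bound $\binom{m}{\lfloor m/2 \rfloor}/2^m$ from the introduction. The function names are illustrative, and the enumeration is only feasible for small $m$.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb


def max_concentration(coeffs):
    """Largest probability that sum(a_i * x_i) hits a single value,
    for x_i independent and uniform on {-1, 1} (exhaustive enumeration)."""
    n = len(coeffs)
    counts = Counter(
        sum(a * x for a, x in zip(coeffs, signs))
        for signs in product((-1, 1), repeat=n)
    )
    return Fraction(max(counts.values()), 2 ** n)


def erdos_bound(m):
    """Central binomial bound C(m, floor(m/2)) / 2^m for m nonzero coefficients."""
    return Fraction(comb(m, m // 2), 2 ** m)
```

Equal coefficients attain the bound, while coefficients such as 1, 2, 4, 8 give all sums distinct and hence the smallest possible concentration $2^{-m}$.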



3.9. The proof of Lemma 2. This proof will follow along very similar lines to 
that of Lemma 1. 

Again we let kg := [log'^rj, and the argument will make use of the following 
analogue of neighborliness: 

Definition 6. A tuple {vi,. . . Vk) of vectors is friendly if 

^{v^y = v^y = ---=vly = 0)> ir-l+^ 



We again have that there are many friendly fc— tuples. 

Lemma 10. Let k < k^. Under the hypotheses of Lemma 2, there are at least 

1— ^ 

m''{l — - — -) friendly k— tuples whose elements are the transposes of rows in A. 



We also claim that friendly tuples exhibit a similar structure as neighborly ones: 
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Lemma 11. Let k < ko, and let {vi, . . .Vk) be friendly. Then there are unique real 
numbers dj such that if Sj denotes the places where vi differs from djvj, then 

k 3-1 

]\\S,\[jS,\,<2r'-^\ 

The proof of Lemma 2 from these two lemmas is exactly the same as that of Lemma 1 from Lemmas 3 and 4. We will therefore focus on the proofs of the two lemmas, which will again turn out to be similar to the proofs of the corresponding lemmas for neighborly tuples.

3.10. The proof of Lemma 10. We define Z to be those fc— tuples satisfying 

P{v[y = wjy = • • • = vly = OAy atypical) > ip(j/ atypical). 

By our assumptions about $A$ every tuple in $Z$ is friendly. Now consider a tuple $(v_1, \dots, v_k; y)$ where the $v_i$ are chosen randomly from the rows of $A$ and $y$ is uniform and random. We estimate the probability that $y$ is atypical and $v_j^Ty = 0$ for every $j$ in two different ways.

Method 1: For any atypical y, there are at least {m — r^~i)'' choices for the tuple. 
It follows that the probability is at least 

(m — r^~3)''„, . , , 
T P{y atypical ) 

Method 2: We first choose the $k$-tuple, then bound the probability that $y$ works based on whether or not the tuple is in $Z$. Doing this gives that the probability is at most

+ _ |Z|))P(y atypical ). 

The result follows by comparing the bounds from the two methods, along with the 
bound 

(m - ri-t)'= > m''(l - ) > m'=(l - ). 

TO TO 

3.11. The proof of Lemma 11. We first note that for any j, we can view 
the system vfy = vJy as a single vector equation '^^Wtyi = in B?, where 

Wi =< vi(i),Vj{i) >. Since by assumption this equation is satisfied with probabil- 
ity ir~^+', it follows from the 2-dimensional Theorem 2 of Halasz that there must 
be a 1-dimensional subspace containing all but 0(r^~^) of the Wi. In terms of the 
Vj, this says that for each j there is a multiple of Vj differing from vi in at most 
r^"*^ places. We will take those multiples to be our dj, and Sj to be the places they 
differ. 
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The relationship vfy = V2 j/ = • • • = v'^y = is equivalent to the system 
'^{d2V2{i) - vi{i))xi = 

^ {d3V3{i) - vi{i))xi = - ^ {d3Vs{i) - vi{i))xi 
ieS3\S2 ieS2 

^ {dk{i) - vi{i))xi = - ^ {dk{i) - vi{i))xi. 

<5?S2U-.-US^_l 

i^S2U---USfc_i ieS2U...5fc_i 

since the first $k - 1$ equations each represent $d_jv_j^Ty = v_1^Ty$ for some $j$, and the last equation represents $v_1^Ty = 0$. As in the proof of Lemma 4, we expose each variable in $S_2$, then the remainder of $S_3$, then the remainder of $S_4$, and so forth. After all the variables in $S_2$ through $S_j$ have been exposed, the probability that the remaining variables in $S_{j+1}$ cause the next equation to be satisfied is by Theorem 1 at most

$$\Big| S_{j+1} \setminus \bigcup_{i=2}^{j} S_i \Big|^{-1/2}.$$



Since each Sj contains at most r^~'^ elements, it follows that there must be at 
least r/2 variables still unexposed by the time we expose Sk and arrive at the last 
equation. Therefore the probability this last equation holds is at most 2r~^/^, so 

j=2 i=2 

The lemma follows. 



4. The proof of Theorem 6 



We first note that for any $\theta$,

$$P(x^TAx = L(x) + c) \le P\big(x^T \mathrm{Re}(e^{i\theta}A)\,x = \mathrm{Re}(e^{i\theta}(L(x) + c))\big).$$

Since we can always choose a $\theta$ such that $e^{i\theta}a_{ij}$ has non-zero real part for every $i$ and $j$ for which $a_{ij}$ is nonzero, it suffices to prove the result for the case where the entries of $A$, as well as the coefficients of $L$ and $c$, are real. We will now assume this to be the case.



The proof will proceed by contradiction. Let us assume that for some $\delta$ and all $r_0$ there is an $r > r_0$ and a matrix $A$ such that $P_A > r^{-1/2+\delta}$ and every row of $A$ has at least $r$ nonzero entries.
We will use a decoupling argument to relate probabilities involving $P_A$ to a probability involving $P_B$ for a suitable bilinear form $B$. We will then combine those bounds with Theorem 5 to obtain

Lemma 12. Let A he a matrix satisfying the hypotheses of Theorem 6 such that 
> 7--i/2+<5_ Then there is a principal minor A' of A of size at least n — Q( ^^°f " ) 
and a rank one matrix A" such that A' = A" everywhere off the main diagonal. 



This allows us to essentially reduce to the case where A is rank one. Let us (for 
now) assume that this lemma is true. 

Without loss of generality we may assume that $A'$ consists of the first $m$ rows and columns of $A$. Let $z = (x_1, \dots, x_m)^T$. For any particular values of $x_{m+1}, \dots, x_n$ we have the relationship

$$x^TAx = z^TA'z + L(z) + c',$$

where $L$ and $c'$ are dependent on the exposed variables. Because $x_i^2 = 1$ for every $i$, we can further replace $A'$ by $A''$ by changing $c'$. It follows that

$$P(x^TAx = L(x) + c) \le \sup_{L, c'} P(z^TA''z = L(z) + c').$$


Since A" has rank one, the quadratic form z'^A"z factors as the square of a linear 
form. Since we only removed O C^ogf" ) columns in going from A to A' , it follows 
from our assumptions on r that for sufficiently large n every coefficient of that 
linear form must be nonzero (as A" still has at least | nonzero entries per row). 
We will soon show 

Lemma 13. Let $b_1, \dots, b_m, c_1, \dots, c_m, d$ be real numbers such that all of the $b_i$ are nonzero, and let $\alpha > 0$. Then

$$P\Big(\Big(\sum_{i=1}^{m} b_ix_i\Big)^2 = \sum_{i=1}^{m} c_ix_i + d\Big) = O(m^{-1/2+\alpha}).$$

Combining Lemma 13 with Lemma 12, we see that if for sufficiently large $n$ we have $P_A > r^{-1/2+\delta}$, then we also have $P_A = O(r^{-1/2+\delta/2})$, which is a contradiction. We now turn to the proofs of the lemmas.



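4.1. The proof of Lemma 13. We define

$$t_1 = \sum_{i=1}^{\lfloor m/2 \rfloor} b_ix_i, \qquad s_1 = \sum_{i=1}^{\lfloor m/2 \rfloor} c_ix_i,$$

$$t_2 = \sum_{i=\lfloor m/2 \rfloor + 1}^{m} b_ix_i, \qquad s_2 = \sum_{i=\lfloor m/2 \rfloor + 1}^{m} c_ix_i.$$

In terms of these new variables, we are attempting to show

$$P(2t_1t_2 + t_1^2 + t_2^2 = s_1 + s_2 + d) = O(m^{-1/2+\alpha}). \qquad (8)$$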
The left hand side of (8) can be thought of as the probability that the point $p$ and the line $l$ are incident, where

$$p = (t_2,\ s_2 - t_2^2), \qquad l = \{y = 2t_1x + t_1^2 - s_1 - d\}.$$
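The reduction to a point-line incidence is a purely algebraic identity, which the following sketch checks over a small grid of integer values. It takes the point as $(t_2, s_2 - t_2^2)$ and the line as $y = 2t_1x + t_1^2 - s_1 - d$, with the constant term signed so that incidence is equivalent to the equation $2t_1t_2 + t_1^2 + t_2^2 = s_1 + s_2 + d$; the function names are illustrative.

```python
from itertools import product


def equation_holds(t1, t2, s1, s2, d):
    """The event whose probability is being bounded."""
    return 2 * t1 * t2 + t1 ** 2 + t2 ** 2 == s1 + s2 + d


def incident(t1, t2, s1, s2, d):
    """Is the point built from (t2, s2) on the line built from (t1, s1)?"""
    px, py = t2, s2 - t2 ** 2                        # point depends on t2, s2 only
    slope, intercept = 2 * t1, t1 ** 2 - s1 - d      # line depends on t1, s1 only
    return py == slope * px + intercept


def agree_everywhere(values, d):
    """Check that the two formulations agree on every 4-tuple from `values`."""
    return all(
        equation_holds(*v, d) == incident(*v, d)
        for v in product(values, repeat=4)
    )
```

Because the point uses only the second half of the variables and the line only the first half, the two objects are independent, which is what the incidence theorem requires.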

Note that $p$ and $l$ are independent, as they involve different sets of variables. We now make use of the following probabilistic variant of the Szemerédi-Trotter theorem, which is essentially a rescaling of the weighted Szemerédi-Trotter result of Iosevich, Konyagin, Rudnev, and Ten [6]:

Theorem 8. Let $(p, l)$ be a point and line independently chosen in $\mathbb{R}^2$. Let

$$q_p := \sup_{p_0} P(p = p_0), \qquad q_l := \sup_{l_0} P(l = l_0).$$

Then the probability that $p$ and $l$ are incident is bounded by

$$P(p \in l) = O\big((q_pq_l)^{1/3} + q_p + q_l\big).$$

Since $p$ uniquely determines $t_2$ and $l$ uniquely determines $t_1$, it follows from Theorem 1 that $q_p$ and $q_l$ are at most $O(m^{-1/2})$. We are therefore done unless

$$q_pq_l > n^{-3/2+\alpha}. \qquad (9)$$

If (9) holds, it follows that there is some point po which is chosen with probability 
at least From the definition of p, we know that there are real numbers 

and So such that 

P{t2 = io A S2 = So) > 

It follows from the $d = 2$ case of Halasz's Theorem 2 that the coefficient vectors of

t2 and S2 must be close to being multiples of each other, that is to say there is an 
\S\ C {[^J + 1, . . .m} with l^l > ^ and a real number cq such that cj = bjCo for 
every j G S. 

We now expose every variable not in $S$. Once we have done so, we are left with an equation of the form

$$\Big(\sum_{j \in S} b_jx_j + d_1\Big)^2 = c_0\Big(\sum_{j \in S} b_jx_j\Big) + d_2, \qquad (10)$$

where $d_1$ and $d_2$ are constants depending on the exposed variables. For any given $d_1$ and $d_2$, there are at most 2 values of $\sum_{j \in S} b_jx_j$ for which (10) holds. It therefore follows from Theorem 1 that for any given $d_1$ and $d_2$ the probability that (10) holds is $O(m^{-1/2})$. Lemma 13 follows from taking expectations over all $d_1$ and $d_2$.

4.2. The proof of Lemma 12. We will make use of the following "decoupling" lemma (originally proved in [12]) to reduce from the quadratic case to the bilinear one.

Lemma 14. Let $Y$ and $Z$ be independent variables, and let $Z'$ be a disjoint copy of $Z$. Let $E(Y, Z)$ be an event depending on $Y$ and $Z$. Then

$$P(E(Y, Z))^2 \le P(E(Y, Z) \wedge E(Y, Z')).$$
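The decoupling inequality can be verified exactly on small examples. The sketch below is an illustration only: it takes $Y$, $Z$ and $Z'$ to be independent uniform sign vectors (with $Z'$ an independent copy of $Z$, which is our reading of the lemma), enumerates all outcomes, and returns both sides as exact fractions. The names `decoupling_sides` and `quadratic_event` are ours.

```python
from fractions import Fraction
from itertools import product


def decoupling_sides(event, ny, nz):
    """Return (P(E(Y,Z))^2, P(E(Y,Z) and E(Y,Z'))) by exhaustive enumeration.

    Y is uniform on {-1,1}^ny; Z and Z' are independent and uniform on {-1,1}^nz.
    """
    signs = (-1, 1)
    ys = list(product(signs, repeat=ny))
    zs = list(product(signs, repeat=nz))
    single = sum(1 for y in ys for z in zs if event(y, z))
    joint = sum(
        1
        for y in ys
        for z in zs
        for z2 in zs
        if event(y, z) and event(y, z2)
    )
    lhs = Fraction(single, len(ys) * len(zs)) ** 2
    rhs = Fraction(joint, len(ys) * len(zs) ** 2)
    return lhs, rhs


def quadratic_event(y, z):
    """Example event: the square of the sum of all variables equals 4."""
    return (sum(y) + sum(z)) ** 2 == 4
```

For the example event with two variables in each part, both sides equal $1/4$. For an event depending on $Y$ alone, the inequality is strict.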
In our case this implies that if $X = \{x_1, \dots, x_n\}$ is a collection of independent Bernoulli variables partitioned into two disjoint subsets $Y$ and $Z$, then

$$P(x^TAx = L(x) + c)^2 = P\Big(\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}x_ix_j = L(x) + c\Big)^2$$

$$\le P\Big(\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}x_ix_j = L_1(y) + L_2(z) + c \ \wedge\ \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}x'_ix'_j = L_1(y) + L_2(z') + c\Big),$$

where $x'_j = x_j$ if $j \in Y$ and $x'_j$ is the copy of $x_j$ in $Z'$ if $j \in Z$, and $L(x) = L_1(y) + L_2(z)$ is the natural decomposition of $L$ into the sum of linear forms on $y$ and $z$.

Let us further suppose that \Y\ = \Z\ or |y| = \Z\ + 1. All terms only involving 
variables in Y disappear from the right hand side of this last inequality, and we 
have 

Pix'^Ax = L{x)+cf <Pi2 ^ aijXi{yj-y'j) = Li{y)-Li{y') + Q{y,y')), 

where $Q$ is another quadratic form. By assumption the left hand side of this equation is at least $r^{-1+2\delta}$, while the right hand side has the form $x^TBy = f(y)$.

If we further knew that for every i G Y there were at least | different j & Z such 
that Aij ^ 0, it would follow from Theorem 5 that the matrix B must contain a 
rank 1 square submatrix of size n — Os ( ^^^s ^ ) • With this observation in mind, we 

make the following definition: 

Definition 7. Given a quadratic form $A$, a partition $\{x_1, \dots, x_n\} = Y \cup Z$ of the $n$ variables into two disjoint subsets is balanced if for every $x_i \in Y$ there are at least $r$ different $x_j \in Z$ for which $a_{ij} \neq 0$.

In terms of our original $A$, we know that for any balanced decomposition of the variables into two equal parts $Y$ and $Z$, the submatrix corresponding to $Y \times Z$ is equal to a rank one matrix except for a few rogue variables. Our next goal will be to play many such decompositions off of each other.

Since the reduction to a bilinear form only gives us information about the entries in $Y \times Z$, we will want to choose a collection of balanced decompositions such that many different entries appear in this submatrix for some element of the collection. Motivated by this, we make the following definition:

Definition 8. Let $\mathcal{F} = (Y_1, Z_1), \dots, (Y_m, Z_m)$ be a collection of balanced partitions of a set $X = \{x_1, \dots, x_n\}$ into pairs of disjoint subsets of equal size. We say $\mathcal{F}$ shatters $X$ if for every $i \neq j \neq k \neq l$ there is an $r = r(i, j, k, l)$ such that $i, j \in Y_r$ and $k, l \in Z_r$.

In terms of our decoupling, a shattering collection of partitions means that every pair of off-diagonal entries $a_{ik}$ and $a_{jl}$ will appear simultaneously in the bilinear form for some element of $\mathcal{F}$. We next show that we don't have to consider too many partitions at once.
Lemma 15. If $|X| = 2m$, there is an $\mathcal{F}$ of size at most $\lceil 5 \ln n / \ln(17/16) \rceil < 83 \ln n$ which shatters $X$.



Proof. Let $\mathcal{F}$ be a collection of size $\lceil 5 \ln n / \ln(17/16) \rceil$ formed by independently and uniformly choosing $(Y_s, Z_s)$ from the set of all partitions of $X$ into two parts of equal size. For any given quadruple $(i, j, k, l)$, the probability that $Z_s$ contains $\{k, l\}$ while $Y_s$ contains $\{i, j\}$ is at least $\frac{1}{17}$, and these events are independent over all $s$. It therefore follows from the union bound that the probability that $X$ fails to be shattered by this collection is at most

$$n^4 \Big(\frac{16}{17}\Big)^{|\mathcal{F}|} + P(\text{some } (Y_s, Z_s) \text{ is not balanced}).$$

The first term is $o(1)$ by our choice of $|\mathcal{F}|$. For the second term, we note that by
standard large deviation techniques the probability that for any given s and i that 
Xi G Ys and there are at most | nonzero a^- with 7 € Zg is ©(e^*"/^). It follows 
from the union bound and our assumption on $r$ that the second term is also $o(1)$. Since a random collection almost surely shatters $X$, there must be at least one shattering collection. ■
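The probabilistic construction in this proof is easy to simulate. The sketch below samples random partitions into two equal halves and tests the shattering condition by brute force. It is a simplified illustration: the "balanced" requirement, which depends on the matrix $A$, is ignored, indices $0, \dots, n-1$ stand in for the variables, and all function names are ours.

```python
import random
from itertools import combinations


def random_equal_partition(n, rng):
    """A uniformly random split of range(n) into two halves (Y, Z)."""
    items = list(range(n))
    rng.shuffle(items)
    return frozenset(items[: n // 2]), frozenset(items[n // 2:])


def shatters(partitions, n):
    """True if every disjoint pair of pairs {i,j}, {k,l} is separated,
    with i, j in Y and k, l in Z, by at least one partition (Y, Z)."""
    for i, j in combinations(range(n), 2):
        rest = [t for t in range(n) if t not in (i, j)]
        for k, l in combinations(rest, 2):
            if not any(
                i in Y and j in Y and k in Z and l in Z for Y, Z in partitions
            ):
                return False
    return True


def find_shattering_family(n, size, seed=0, max_tries=100):
    """Sample families of `size` random partitions until one shatters."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        family = [random_equal_partition(n, rng) for _ in range(size)]
        if shatters(family, n):
            return family
    return None
```

For $n = 4$ the six ordered splits into pairs are all needed, so dropping any one of them breaks shattering. For larger $n$ a logarithmic number of random partitions suffices with high probability, as the union bound in the proof indicates.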



We now fix some .Fq which shatters our original set of variables and has size at 
most 83 log n. For each r, we know from Theorem 5 that we can find exceptional 
sets y; QYs.Z'^ Zs with ly^U^sl = O(j^) such that the submatrix of A 
corresponding to x (Z\Z'^ has rank one. Let 

Without loss of generality we may assume that W = {xn-t+i, ■ ■ ■ Xn}- By assump- 
tiont = 0(l||f). 

For any 4 distinct elements $(i, j, k, l)$ disjoint from $W$, we know from the definition of $\mathcal{F}_0$ and $W$ that for some $s$ the $2 \times 2$ submatrix of $A$ on $\{i, j\} \times \{k, l\}$ appeared in a rank one submatrix of $Y_s \times Z_s$. It follows that for every set of distinct $(i, j, k, l)$, we have $a_{ik}a_{jl} = a_{jk}a_{il}$. In particular, for every pair $(j, l)$ with $3 \le j \neq l \le n - t$, we have

$$a_{jl} = a_{1l}\,\frac{a_{j2}}{a_{12}}. \qquad (11)$$

We can therefore take $A'$ to be the principal minor of $A$ on $\{x_3, \dots, x_{n-t}\}$, and $A''$ to be the matrix for which the right hand side of (11) also holds for $j = l$.



4.3. The proof of Corollary 1. Construct a graph whose vertices are the variables $x_i$, with $x_i$ adjacent to $x_j$ for $i \neq j$ iff $a_{ij}$ is nonzero. By assumption, this graph has average degree at least $m - 1$. It follows that it must contain a subgraph of minimum
degree at least In matrix terms, this implies that A contains a principal minor 
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A' such that every row of A' has at least ^^^^ nonzero entries. Without loss of gen- 
erality we may assume that the minor corresponds to the variables x = {xi, . . .Xk}- 
For any fixed value of Xk+i ■ ■ - Xn, the equation xtAx = L{x) + c becomes 

x^ Ax = L{x) + c, 

an equation which holds with probability Os{m~i~^^) by Theorem 6. The result 
follows from taking expectations over all values of Xk+i ■ ■ ■ Xn- 
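The graph step in this proof rests on the standard fact that a graph of average degree $D$ contains a subgraph of minimum degree at least $D/2$. A minimal sketch of the usual peeling argument follows. The threshold $D/2$ is the textbook constant and is an assumption here, not necessarily the exact constant used in the corollary; the function name is illustrative.

```python
def min_degree_subgraph(adj):
    """Peel vertices of degree below half the average degree.

    adj maps each vertex to the set of its neighbours (undirected, no loops).
    Returns the adjacency map of the surviving subgraph, which is non-empty
    whenever the graph has at least one edge and has minimum degree at least
    (number of edges) / (number of vertices), i.e. half the average degree.
    """
    adj = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    edges = sum(len(nb) for nb in adj.values()) // 2
    threshold = edges / len(adj)  # half of the average degree 2E/V
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            if len(adj[v]) < threshold:
                for u in adj[v]:
                    adj[u].discard(v)
                del adj[v]
                changed = True
    return adj
```

Each removal deletes fewer than $E/V$ edges, so the peeling cannot delete every vertex. In matrix terms the surviving vertices index a principal minor in which every row has many nonzero off-diagonal entries.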

5. Extensions of the Main Results and Conjectures 

5.1. Inverse results for more weakly concentrated bilinear forms. It is an interesting problem to consider whether there are similar inverse results holding in general for when a bilinear form has polynomially large concentration on one value, $P(x^TAy = c) > n^{-b}$ for some $b$.

There are at least two different types of structure that lead to sufficient conditions for this to occur. One possibility is algebraic: if the coefficient matrix has low rank, then $f(x, y)$ will be equal to 0 whenever a small number of linear forms is equal to 0, which may not be too unlikely an event if some of those forms are structured. For example, if $A$ is chosen to satisfy $a_{ij} = f(i) + g(j)$ (for arbitrary $f$ and $g$), then $x^TAy$ can be expressed as

$$(x_1 + x_2 + \cdots + x_n)(g(1)y_1 + \cdots + g(n)y_n) + (f(1)x_1 + \cdots + f(n)x_n)(y_1 + \cdots + y_n)$$

and is 0 whenever $x_1 + \cdots + x_n = y_1 + \cdots + y_n = 0$, an event which occurs with
probability approximately 

Another possibility is arithmetic: If the entries of the coefficient matrix are all 
drawn from a short generalized arithmetic progression of bounded rank, then the 
output of x^ Ay will also lie in such a progression, and will by the pigeonhole 
principle take on a single value with polynomial probability. We conjecture that 
these two ways, and combinations thereof, are essentially the only way a bilinear 
form can have polynomial concentration, that is to say 

Conjecture 1. For any $a > 0$ there are constants $a_1, a_2, a_3$ and $N_0$ such that for all $n > N_0$ the following holds: If $A$ is an $n \times n$ matrix of nonzero entries such that for $x$ and $y$ uniformly and independently chosen from $\{-1, 1\}^n$,

$$\sup_c P(x^TAy = c) > n^{-a},$$

then $A$ can be written as $A_1 + A_2 + A_3$, where $A_1$ has rank at most $a_1$, the entries of $A_2$ are drawn from a generalized arithmetic progression of rank at most $a_2$ and

2 

volume at most 03, and A^ contains at most j^-^ nonzero entries. 

5.2. Higher degrees. In this section we give several conjectured extensions of the main results of this paper to multilinear and polynomial forms. We begin with the following (simplified) analogue of Theorem 4, which can be proved by the same method.






KEVIN P. COSTELLO 



Theorem 9. Let $k$ be a fixed positive integer. Let $y_1 = (x_{1,1}, \dots, x_{n,1}), \dots, y_k = (x_{1,k}, \dots, x_{n,k})$ be $k$ independent vectors uniformly chosen from $\{-1, 1\}^n$, and let

$$A(y_1, \dots, y_k) := \sum_{i_1=1}^{n} \sum_{i_2=1}^{n} \cdots \sum_{i_k=1}^{n} a_{i_1i_2 \dots i_k}\, x_{i_1,1} \cdots x_{i_k,k}$$

be a $k$-multilinear form whose coefficients $a_{i_1 \dots i_k}$ are all nonzero. Then for any function $f$ of $k - 1$ variables,

$$P(A(y_1, \dots, y_k) = f(y_2, \dots, y_k)) = O_k(n^{-1/2}). \qquad (12)$$

Again, this is tight for degenerate forms which contain a linear factor. A natural 
conjecture would be that non-degenerate forms are significantly less concentrated. 

Conjecture 2. Let $k$, $A$, $y$, and $f$ be as in Theorem 9. If there is some $\epsilon > 0$ such that

$$P(A(y_1, \dots, y_k) = f(y_2, \dots, y_k)) > n^{-k/2+\epsilon},$$

then there is a partition of $\{y_1, \dots, y_k\}$ into disjoint sets $S$ and $T$ and functions $f_1$ and $f_2$ such that $f_1$ depends only on the variables in $S$, $f_2$ only on the variables in $T$, and $A$ differs from $f_1f_2$ in $o(n^k)$ coefficients.

The $k/2$ in this conjecture comes from how $n^{k/2}$ is the typical magnitude of $f$ in the case where the coefficients of $A$ are random (small) integers.

We can also conjecture a polynomial analogue to Theorem 6, including an analogous 

inverse theorem to the above multilinear one. 

Conjecture 3. Let $x_1, \dots, x_n$ be independent and uniformly chosen from $\{-1, 1\}$, and let

$$f(x_1, \dots, x_n) = \sum_{1 \le i_1 < \cdots < i_k \le n} a_{i_1 \dots i_k}\, x_{i_1} \cdots x_{i_k}$$

be a degree $k$ homogeneous polynomial with at least $mn^{k-1}$ nonzero coefficients. Then

$$\sup_c P(f(x_1, \dots, x_n) = c) = O(m^{-1/2}).$$

If the above concentration is at least $\Omega_k(m^{-1/2+\epsilon})$, then $f$ differs in only a few coefficients from a polynomial which factors.

In [2], a proof of the first half of this conjecture was given with m~^/^ replaced by 

^ . For the second half, we do not have a proof of this conjecture even in 
the case k = 2. 
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